Finding Information in Audio: a New Paradigm for Audio Browsing and Retrieval

Authors

  • Julia Hirschberg
  • Steve Whittaker
  • Don Hindle
  • Fernando Pereira
  • Amit Singhal

Abstract

Information retrieval from audio data is sharply different from information retrieval from text, not simply because speech recognition errors affect retrieval effectiveness, but more fundamentally because of the linear nature of speech, and of the differences in human capabilities for processing speech versus text. We describe SCAN, a prototype speech retrieval and browsing system that addresses these challenges of speech retrieval in an integrated way. On the retrieval side, we use novel document expansion techniques to improve retrieval from automatic transcription to a level competitive with retrieval from human transcription. Given these retrieval results, our graphical user interface, based on the novel WYSIAWYH (“What you see is almost what you hear”) paradigm, infers text formatting such as paragraph boundaries and highlighted words from acoustic information and information retrieval term scores to help users navigate the errorful automatic transcription. This interface supports information extraction and relevance ranking demonstrably better than simple speech-alone interfaces, according to results of empirical studies.

Similar resources

Beyond the Query-By-Example Paradigm: New Query Interfaces for Music Information Retrieval

The majority of existing work in music information retrieval for audio signals has followed the content-based query-by-example paradigm. In this paradigm a musical piece is used as a query and the result is a list of other musical pieces ranked by their content similarity. In this paper we describe algorithms and graphical user interfaces that enable novel alternative ways for querying and brow...

Enhancing Sonic Browsing Using Audio Information Retrieval

Collections of sound and music of increasing size and diversity are used both by typical computer users and multimedia designers. Browsing audio collections poses several challenges to the design of effective user interfaces. Recent techniques in audio information retrieval allow the automatic extraction of audio content information. This information can be used to inform and enhance audio brow...

"I Just Played That A Minute Ago!" - Designing User Interfaces For Audio Navigation

The current popularity of multimodal information retrieval research critically assumes that consumers will be found for the multimodal information thus retrieved and that interfaces can be designed that will allow users to search and browse multimodal information effectively. While there has been considerable effort given to developing the basic technologies needed for information retrieval fro...

Prototyping a Vibrato-Aware Query-By-Humming (QBH) Music Information Retrieval System for Mobile Communication Devices: Case of Chromatic Harmonica

Background and Aim: The current research aims at prototyping query-by-humming music information retrieval systems for smart phones. Methods: This multi-method research follows simulation technique from mixed models of the operations research methodology, and the documentary research method, simultaneously. Two chromatic harmonica albums comprised the research population. To achieve the purpose ...

Spoken Content-Based Audio Navigation (SCAN)

We describe SCAN, a system for retrieving and browsing speech documents from large audio corpora that uses new information retrieval and speech processing techniques to create easily navigable presentations of documents relevant to a user query. Experiments show that the new interface is more effective than simple speech-alone interfaces.

Publication date: 1999